Bayesian Optimal Control of Smoothly Parameterized Systems
Abstract
We study Bayesian optimal control of a general class of smoothly parameterized Markov decision problems (MDPs). We propose a lazy version of the so-called posterior sampling method, a method that goes back to Thompson and Strens, more recently studied by Osband, Russo and van Roy. While Osband et al. derived a bound on the (Bayesian) regret of this method for undiscounted total cost episodic, finite state and action problems, we consider the continuing, average cost setting with no cardinality restrictions on the state or action spaces. While in the episodic setting, it is natural to switch to a new policy at the episode-ends, in the continuing average cost framework we must introduce switching points explicitly and in a principled fashion, or the regret could grow linearly. Our lazy method introduces these switching points based on monitoring the uncertainty left about the unknown parameter. To develop a suitable and easy-to-compute uncertainty measure, we introduce a new “average local smoothness” condition, which is shown to be satisfied in common examples. Under this, and some additional mild conditions, we derive rate-optimal bounds on the regret of our algorithm. Our general approach allows us to use a single algorithm and a single analysis for a wide range of problems, such as finite MDPs or linear quadratic regulation, both being instances of smoothly parameterized MDPs. The effectiveness of our method is illustrated by means of a simulated example.
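The lazy switching idea can be illustrated with a minimal sketch on a scalar linear quadratic problem, one of the instances the abstract mentions. Everything specific here is an assumption made for illustration and is not taken from the paper: the system `x' = a*x + b*u + noise`, the Gaussian prior on `(a, b)`, the known noise level, the function names, and in particular the switching rule (resample the parameter only when the determinant of the posterior precision matrix has doubled since the last switch), which stands in for the uncertainty measure the abstract refers to.

```python
import numpy as np


def scalar_lqr_gain(a, b, q=1.0, r=1.0, iters=200):
    """Return k with u = -k * x, by iterating the scalar Riccati recursion."""
    p = q
    for _ in range(iters):
        p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
    return a * b * p / (r + b * b * p)


def lazy_posterior_sampling(a_true=0.9, b_true=0.5, horizon=500,
                            noise_std=0.1, seed=0):
    """Hypothetical sketch: posterior sampling with lazy policy switches."""
    rng = np.random.default_rng(seed)
    # Assumed prior: theta = (a, b) ~ N(0, I); noise variance assumed known.
    precision = np.eye(2)   # posterior precision matrix
    moment = np.zeros(2)    # precision-weighted sum of z * x_next

    def sample_gain():
        cov = np.linalg.inv(precision)
        cov = (cov + cov.T) / 2.0          # symmetrize against round-off
        mean = cov @ moment
        a, b = rng.multivariate_normal(mean, cov)
        return scalar_lqr_gain(a, b)       # policy optimal for the sample

    gain = sample_gain()
    logdet_at_switch = np.linalg.slogdet(precision)[1]
    n_switches = 1
    x, total_cost = 1.0, 0.0

    for _ in range(horizon):
        u = -gain * x
        total_cost += x * x + u * u
        x_next = a_true * x + b_true * u + noise_std * rng.standard_normal()

        # Conjugate Bayesian linear-regression update of the posterior.
        z = np.array([x, u])
        precision += np.outer(z, z) / noise_std ** 2
        moment += z * x_next / noise_std ** 2

        # Assumed lazy rule: switch only once det(precision) has doubled.
        logdet = np.linalg.slogdet(precision)[1]
        if logdet > logdet_at_switch + np.log(2.0):
            gain = sample_gain()
            logdet_at_switch = logdet
            n_switches += 1
        x = x_next

    return n_switches, total_cost / horizon


if __name__ == "__main__":
    switches, avg_cost = lazy_posterior_sampling()
    print(f"policy switches: {switches}, average cost: {avg_cost:.3f}")
```

Because the policy is recomputed only at switching points, the expensive planning step (here a Riccati iteration) runs far less often than once per time step, while the posterior itself is still updated after every observation.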
Related resources
Bayesian Optimal Control of Smoothly Parameterized Systems: The Lazy Posterior Sampling Algorithm
We study Bayesian optimal control of a general class of smoothly parameterized Markov decision problems. Since computing the optimal control is computationally expensive, we design an algorithm that trades off performance for computational efficiency. The algorithm is a lazy posterior sampling method that maintains a distribution over the unknown parameter. The algorithm changes its policy only...
Fuzzy adaptive tracking control for a class of nonlinearly parameterized systems with unknown control directions
This paper addresses the problem of adaptive fuzzy tracking control for a class of nonlinearly parameterized systems with unknown control directions. The nonlinearly parameterized functions are lumped into unknown continuous functions, which can be approximated by Mamdani-type fuzzy logic systems (FLS). Then, the Nussbaum-type function is used to de...
Pontryagin's Minimum Principle for Fuzzy Optimal Control Problems
The objective of this article is to derive the necessary optimality conditions, known as Pontryagin's minimum principle, for fuzzy optimal control problems, based on the concepts of differentiability and integrability of a fuzzy mapping that may be parameterized by the left-hand and right-hand functions of its $\alpha$-level sets.
ADAPTIVE FUZZY TRACKING CONTROL FOR A CLASS OF PERTURBED NONLINEARLY PARAMETERIZED SYSTEMS USING MINIMAL LEARNING PARAMETERS ALGORITHM
In this paper, an adaptive fuzzy tracking control approach is proposed for a class of single-input single-output (SISO) nonlinear systems in which the unknown continuous functions may be nonlinearly parameterized. During the controller design procedure, Mamdani-type fuzzy logic systems (FLS) are applied to approximate the unknown continuous functions, and then, based on the minimal learnin...
Delay-Dependent Robust Asymptotically Stable for Linear Time Variant Systems
In this paper, the problem of delay-dependent robust asymptotic stability for uncertain linear time-variant systems with multiple delays is investigated. A new delay-dependent sufficient condition for stability is given by using the Lyapunov method, linear matrix inequalities (LMI), a parameterized first-order model transformation technique, and transformation of the interval uncertainty into the norm ...
Publication date: 2015